Learning about reward and the expected values of choice alternatives is critical for adaptive behavior. Although human choice is affected by how frequently reward-related alternatives are presented, this effect is overlooked by some dominant models of value learning. For instance, the delta rule learns average rewards, whereas the decay rule learns cumulative rewards for each option. In a binary-outcome choice task, participants chose between pairs of options that had reward probabilities of .65 (A) versus .35 (B) or .75 (C) versus .25 (D). Crucially, during training there were twice as many AB trials as CD trials, so option A was associated with higher cumulative reward, whereas option C gave higher average reward. Participants then decided between novel combinations of options (e.g., AC). On these trials, participants preferred option A, a result predicted by the Decay model but not by the Delta model. This suggests that expected values are based more on total reward than on average reward.
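The contrast between the two learning rules can be illustrated with a minimal simulation sketch. It uses one common formulation of each rule, which the text above does not specify: a delta update with learning rate `alpha`, and a decay update in which every value shrinks by a factor `decay` on each trial before the chosen option's value is incremented by the reward. The function names, the parameter values, the number of trials, the zero starting values, and the random choice policy are all illustrative assumptions rather than details of the reported study. Only the reward probabilities and the 2:1 ratio of AB to CD trials come from the description above.

```python
import random

# Reward probabilities stated in the abstract.
REWARD_PROB = {"A": 0.65, "B": 0.35, "C": 0.75, "D": 0.25}


def simulate_training(rng, n_cd_trials=50, alpha=0.1, decay=0.9):
    """Train a Delta and a Decay learner on one shuffled trial sequence.

    n_cd_trials, alpha, and decay are illustrative assumptions.
    """
    # Twice as many AB trials as CD trials, as in the training phase.
    trials = [("A", "B")] * (2 * n_cd_trials) + [("C", "D")] * n_cd_trials
    rng.shuffle(trials)

    delta = {option: 0.0 for option in REWARD_PROB}
    decayed = {option: 0.0 for option in REWARD_PROB}

    for pair in trials:
        # Assumption: choices are random, so both rules see identical data.
        chosen = rng.choice(pair)
        reward = 1.0 if rng.random() < REWARD_PROB[chosen] else 0.0

        # Delta rule: move the chosen option's value toward the outcome,
        # which tracks a recency-weighted average reward.
        delta[chosen] += alpha * (reward - delta[chosen])

        # Decay rule: every value decays, then the chosen option's value
        # is incremented by the reward, which tracks cumulative reward.
        for option in decayed:
            decayed[option] *= decay
        decayed[chosen] += reward

    return delta, decayed


def mean_values(n_runs=1000, seed=0):
    """Average the final values over n_runs simulated learners."""
    rng = random.Random(seed)
    totals = {
        "delta": {option: 0.0 for option in REWARD_PROB},
        "decay": {option: 0.0 for option in REWARD_PROB},
    }
    for _ in range(n_runs):
        delta, decayed = simulate_training(rng)
        for option in REWARD_PROB:
            totals["delta"][option] += delta[option] / n_runs
            totals["decay"][option] += decayed[option] / n_runs
    return totals


if __name__ == "__main__":
    values = mean_values()
    print("Delta:", values["delta"])
    print("Decay:", values["decay"])
```

Under these assumptions, the averaged Delta values should rank C above A, because C pays off more often when chosen, while the averaged Decay values should rank A above C, because A is encountered and rewarded more often in total. This mirrors the opposing predictions for AC trials described above, although fitting the models to actual choice data would require a choice rule such as softmax, which is omitted here.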